Convolutional autoencoders are trained on a database of airfoil aerodynamic simulations and evaluated for overall accuracy and interpretability. The goal is to predict stall and to investigate the autoencoders' ability to distinguish the linear and nonlinear responses of the airfoil pressure distribution to variations in the angle of attack. After a sensitivity analysis of the learning infrastructure, we study the latent space identified by the autoencoders at extreme compression rates, i.e., for very low-dimensional reconstructions. We also propose a strategy that uses the decoder to generate new synthetic airfoil geometries and aerodynamic solutions by interpolating and extrapolating in the latent representation learned by the autoencoders.
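The generative strategy in the final sentence can be illustrated with a small sketch. Below is a minimal 1-D convolutional autoencoder in PyTorch together with a latent interpolation/extrapolation step; the layer sizes, the two-dimensional latent code, and all variable names are illustrative assumptions, not the configuration used in the paper.

```python
# Minimal sketch: a 1-D convolutional autoencoder compressing surface-pressure
# distributions to a very low-dimensional latent code, then decoding
# interpolated/extrapolated codes to synthesize new solutions.
# All shapes and hyperparameters are assumptions for illustration.
import torch
import torch.nn as nn

class ConvAutoencoder(nn.Module):
    def __init__(self, n_points=256, latent_dim=2):
        super().__init__()
        # Encoder: 1-D convolutions over the chordwise pressure signal.
        self.encoder = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=5, stride=2, padding=2), nn.ReLU(),
            nn.Conv1d(16, 32, kernel_size=5, stride=2, padding=2), nn.ReLU(),
            nn.Flatten(),
            nn.Linear(32 * (n_points // 4), latent_dim),  # extreme compression
        )
        # Decoder mirrors the encoder.
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 32 * (n_points // 4)),
            nn.Unflatten(1, (32, n_points // 4)),
            nn.ConvTranspose1d(32, 16, kernel_size=4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose1d(16, 1, kernel_size=4, stride=2, padding=1),
        )

    def forward(self, x):
        z = self.encoder(x)
        return self.decoder(z), z

# Synthetic generation: decode a point between the latent codes of two
# known flow conditions (alpha in [0, 1]) or beyond them (alpha > 1).
model = ConvAutoencoder()
cp_a = torch.randn(1, 1, 256)  # placeholder pressure distributions
cp_b = torch.randn(1, 1, 256)
z_a, z_b = model.encoder(cp_a), model.encoder(cp_b)
alpha = 1.5                    # > 1 extrapolates past the training conditions
cp_new = model.decoder(z_a + alpha * (z_b - z_a))
```

Varying alpha between 0 and 1 interpolates between two encoded flow conditions, while values outside that range probe the extrapolation behavior discussed in the abstract.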
Transformers have become the state-of-the-art neural network architecture across numerous domains of machine learning. This is partly due to their celebrated ability to transfer and to learn in-context from few examples. Nevertheless, the mechanisms by which Transformers become in-context learners are not well understood and remain largely a matter of intuition. Here, we argue that training Transformers on auto-regressive tasks can be closely related to well-known gradient-based meta-learning formulations. We start by providing a simple weight construction that shows the equivalence of the data transformations induced by 1) a single linear self-attention layer and by 2) gradient descent (GD) on a regression loss. Motivated by that construction, we show empirically that when training self-attention-only Transformers on simple regression tasks, either the models learned by GD and the trained Transformers show great similarity or, remarkably, the weights found by optimization match the construction. Thus we show how trained Transformers implement gradient descent in their forward pass. This allows us, at least in the domain of regression problems, to mechanistically understand the inner workings of optimized Transformers that learn in-context. Furthermore, we identify how Transformers surpass plain gradient descent by an iterative curvature correction and learn linear models on deep data representations to solve non-linear regression tasks. Finally, we discuss intriguing parallels to a mechanism identified as crucial for in-context learning, termed the induction head (Olsson et al., 2022), and show how it could be understood as a specific case of in-context learning by gradient descent within Transformers.
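The weight construction underlying this claim can be checked numerically. The NumPy sketch below compares one gradient-descent step on an in-context linear-regression loss (initialized at zero weights) with a single linear self-attention layer whose key/query matrices read only the input part of each token and whose value matrix rescales the target part. The token layout [x_j; y_j], the zero initialization, and the learning rate eta are assumptions of this sketch, which follows the spirit of the paper's construction rather than its exact parameterization.

```python
# Numerical check: one linear self-attention layer with hand-constructed
# weights reproduces one gradient-descent step on an in-context
# linear-regression loss. Setup and constants are this sketch's assumptions.
import numpy as np

rng = np.random.default_rng(0)
d, N, eta = 3, 8, 0.1                 # input dim, context size, GD step size
X = rng.normal(size=(N, d))           # in-context inputs x_j
w_true = rng.normal(size=d)
y = X @ w_true                        # in-context targets y_j
x_q = rng.normal(size=d)              # query input

# One GD step on L(w) = 1/(2N) * sum_j (w . x_j - y_j)^2, starting at w0 = 0:
w0 = np.zeros(d)
w1 = w0 - eta / N * (X @ w0 - y) @ X  # gradient step; equals eta/N * y @ X here
pred_gd = w1 @ x_q

# Linear self-attention (no softmax) over tokens e_j = [x_j; y_j]:
E = np.hstack([X, y[:, None]])        # context tokens, shape (N, d+1)
e_q = np.hstack([x_q, 0.0])           # query token with an empty target slot
P = np.eye(d + 1)
P[d, d] = 0.0                         # projector onto the x-part of a token
WK = WQ = P                           # keys/queries attend via x_j . x_q
WV = np.zeros((d + 1, d + 1))
WV[d, d] = eta / N                    # values extract (eta/N) * y_j
scores = E @ WK.T @ (WQ @ e_q)        # attention scores, shape (N,)
attn = WV @ (E.T @ scores)            # attention update for the query token
pred_attn = (e_q + attn)[d]           # updated target slot of the query

assert np.allclose(pred_gd, pred_attn)
```

Because the attention has no softmax, the query update is exactly a weighted sum of the in-context targets with weights x_j · x_q, which coincides with the prediction of the GD-updated linear model.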
Context: Machine learning (ML) has been at the heart of many innovations over the past few years. However, including it in so-called "safety-critical" systems, such as automotive or aeronautic systems, has proven very challenging, since the paradigm shift introduced by ML completely changes traditional certification approaches. Objective: This paper aims to elucidate the challenges related to the certification of ML-based safety-critical systems, along with the solutions proposed in the literature to address them, answering the question "How to certify machine learning based safety-critical systems?". Method: We conducted a systematic literature review (SLR) of research papers published between 2015 and 2020, covering topics related to the certification of ML systems. In total, we identified 217 papers covering what are considered to be the main pillars of ML certification: robustness, uncertainty, explainability, verification, safe reinforcement learning, and direct certification. We analyzed the main trends and problems of each sub-field and provided summaries of the extracted papers. Results: The SLR results highlighted the community's enthusiasm for this topic, as well as a lack of diversity in terms of datasets and model types. They also emphasized the need to further strengthen ties between academia and industry in order to deepen research in this domain. Finally, they illustrated the necessity of building connections between the main pillars mentioned above, which have so far been studied mostly in isolation. Conclusion: We highlighted the efforts currently deployed toward the certification of ML-based software systems and discuss some future research directions.